ABSTRACT

Hosts and pathogens are locked in an 
evolutionary arms race. To infect mice, mouse 
hepatitis coronavirus (MHV) has evolved to 
recognize mouse CEACAM 1a
(mCEACAM 1a) as its receptor. To elude 
MHV infections, mice may have evolved a 
variant allele from the Ceacam1a gene, called
Ceacam1b, producing mCEACAM 1b that is a 
much poorer MHV receptor than 
mCEACAM 1a. Previous studies showed that 
sequence differences between mCEACAM 1a 
and mCEACAM 1b in a critical MHV-binding 
CC’ loop partially account for the low receptor 
activity of mCEACAM 1b, but detailed 
structural and molecular mechanisms for the 
differential MHV receptor activities of 
mCEACAM 1a and mCEACAM 1b remained 
elusive. Here we have determined the crystal 
structure of mCEACAM 1b, and identified the
structural differences and additional residue
differences between mCEACAM 1a and 
mCEACAM 1b that affect MHV binding and 
entry. These differences include 
conformational alterations of the CC’ loop as 
well as residue variations in other MHV-
binding regions, including β-strands C’ and C”
and loop C’C”. Using pseudovirus entry and
protein-protein binding assays, we show that 
substituting the structural and residue features 
from mCEACAM 1b into mCEACAM 1a 
reduced the viral receptor activity of 
mCEACAM 1a, whereas substituting the 
reverse changes from mCEACAM 1a into 
mCEACAM 1b increased the viral receptor 
activity of mCEACAM 1b. These results
elucidate the detailed molecular mechanism 
for how mice may have kept pace in the 
evolutionary arms race with MHV by 
undergoing structural and residue changes in
the MHV receptor, providing insight into this
possible example of pathogen-driven 
evolution of a host receptor protein. 


According to the Red Queen 
hypothesis, hosts and pathogens are in an 
evolutionary arms race to keep pace with each 
other for fitness and survival (1,2). 
Coronaviruses are a large family of ancient 
and diverse RNA virus pathogens that infect 
many mammalian and avian species (3,4). 
Different coronaviruses use a variety of cell 
surface receptors for entry into host cells 
through the activities of virus-surface spike 
proteins (5,6). The host receptor-adapting 
evolution of coronavirus spike proteins has 
been extensively studied (3-6), but 
coronavirus-driven evolution of host receptors 
is much less well understood. The current 
study investigates how a host receptor may 
undergo molecular changes under possible 
selective pressure from lethal coronavirus 
infections and how these changes may help the 
host to resist death from coronavirus 
infections. 

As the prototypic member of the 
coronavirus family, mouse hepatitis 
coronavirus (MHV) presents a good model 
system for studying the co-evolutionary 
relationship between viruses and hosts. 
Depending on the strain, MHV can cause 
enteric, respiratory, or brain infections in 
mice. The enterotropic strains of MHV spread 
widely in susceptible mouse populations and 
are lethal in infant mice of many inbred strains 
(up to 100% fatality) (7). Infection with MHV 
is a major concern in laboratory mice because 
it can disrupt mouse-based research through 
clinical disease and/or alteration of 
immunologic responses (8). MHV uses a cell- 
surface protein, mouse Carcinoembryonic 
Antigen-related Cell Adhesion Molecule 1a
(mCEACAM 1a), as its host receptor (9,10). 
The CEACAM1 protein is widely expressed in 
all mammals on the membranes of epithelial 
cells, endothelial cells, and leukocytes (11). It 
mediates cell-cell adhesion and signaling, and 
participates in the differentiation and 
arrangement of tissue three-dimensional 
structure, angiogenesis, apoptosis, tumor
suppression, cancer metastasis, and the
modulation of innate and adaptive immune 
responses (12). The envelope-anchored MHV 
spike glycoprotein specifically recognizes 
mCEACAM 1a through the N-terminal domain 
of its S1 subunit (S1-NTD) (9,13,14). Our 
previous structural studies revealed that 
coronavirus S1-NTDs have the same tertiary 
structural fold as human galectins (galactose- 
binding lectins), and that whereas MHV S1- 
NTD recognizes mCEACAM1, bovine 
coronavirus (BCoV) S1-NTD recognizes
sugar (15,16). Thus, we proposed that 
coronaviruses acquired a host galectin gene 
and inserted it into their spike protein gene, and
that whereas BCoV S1-NTD has kept its 
original sugar-binding lectin activity, MHV 
S1-NTD has evolved novel mCEACAM 1a-
binding affinity and lost its original sugar-
binding lectin activity. These studies have
provided insight into the host receptor- 
adapting evolution of coronaviruses (5,6). 

To respond to the selective pressure 
from lethal MHV infections, mice may have 
evolved a variant allele from the Ceacam1a
gene, called Ceacam1b; of the two gene
products, mCEACAM 1b is a much less
efficient MHV receptor than mCEACAM 1a
(17-19). Correspondingly, mice homozygous 
for Ceacam1b (1b/1b) are resistant to death 
from MHV infections, while mice 
homozygous for Ceacam1a (1a/1a) are highly 
susceptible to lethal MHV infections 
(7,9,10,20,21). Other than their different MHV 
receptor activities, mCEACAM 1a and
mCEACAM 1b appear to be functionally
equivalent: neither 1a/1a mice nor 1b/1b mice
show any growth defects, whereas a
dysfunctional Ceacam1 gene leads to impaired 
insulin clearance, abnormal weight gain, and 
reduced fertility (22). Our previous structural 
studies of mCEACAM 1a and its complex with 
MHV S1-NTD have delineated detailed
interactions between mCEACAM 1a and MHV 
S1-NTD (15,23). Moreover, previous studies 
identified the CC’ loop (loop that connects β-
strands C and C’) in mCEACAM 1a as critical
for MHV binding; the sequence of this loop 
diverges in mCEACAM 1b, partially
accounting for the low MHV receptor activity
of mCEACAM 1b (17,24). However, due to
the lack of structural information about
mCEACAM 1b, it was not known which
structural differences between mCEACAM 1a
and mCEACAM 1b, or whether additional
residue differences between mCEACAM 1a
and mCEACAM 1b, account for the MHV
resistance in mice homozygous for Ceacam1b.


In this study, we have determined the 
crystal structure of mCEACAM 1b, and 
elucidated the structural differences and 
additional residue differences between 
mCEACAM 1a and mCEACAM 1b that 
impede the binding of MHV S1-NTD to
mCEACAM 1b. Moreover, we have performed
structure-guided mutagenesis studies on 
mCEACAM 1a and mCEACAM 1b to 
investigate the significance of their structural 
and sequence differences upon their MHV 
receptor activities. These results provide 
insight into the possibility that MHV has 
driven the evolution of the mCEACAM1 
protein in mice. 


RESULTS 

Because of alternative mRNA 
splicing, mCEACAM1 contains either two
[D1 and D4] or four [D1-D4] Ig-like domains 
in tandem, in addition to a transmembrane 
anchor and a short intracellular tail at its C- 
terminus (12). mCEACAM1b[D1,D4] 
(residues 1-202) without the membrane anchor 
or the intracellular tail was expressed and 
purified as previously described for
mCEACAM1a[D1,D4] (15). It was
subsequently crystallized in space group
P3,21, with cell dimensions a=113.1 Å,
b=113.1 Å, and c=64.4 Å.
Although each asymmetric unit of the crystal
contains two mCEACAM1b[D1,D4]
molecules, the protein is a monomer in 
solution based on gel filtration 
chromatography. The structure was 
determined by molecular replacement using 
the structure of mCEACAM1a[D1,D4] as the
search template, and refined at 3.1 Å resolution
(Fig. 1A; Table 1). The final model contains 
all of the residues in domains D1 and D4, and 
a glycan N-linked to Asn270. 

The overall structure of 
mCEACAM1b[D1,D4] is similar to that of 
mCEACAM1a[D1,D4], but the structural
similarity is uneven in different regions of the
two proteins. In both mCEACAM 1a and 
mCEACAM 1b, the two Ig-like domains, D1 
and D4, are arranged in tandem without any 
significant interactions with each other (Fig. 
1A, 1B). In mCEACAM 1a, the D1 domain 
binds to MHV S1-NTD, whereas the D4 
domain has no contact with MHV S1-NTD 
(Fig. 1C). In the D1 domain of mCEACAM 1a, 
several loops (CC’, C’C”, C”D, and FG) and 
β-strands (βC, βC’, and βC”) are directly
involved in MHV binding, and thus these 
regions have been called the virus-binding 
motifs (VBMs) (Fig. 1C, 1D). Interestingly, 
the D1 domains of mCEACAM 1a and 
mCEACAM 1b are significantly more 
divergent in both primary structure (sequence 
identity = 74%) and tertiary structure (main 
chain RMSD = 1.11 Å) than the D4 domains
(sequence identity = 98%; main chain RMSD
= 0.77 Å) (Fig. 2). Furthermore, within the D1
domain, the VBMs of mCEACAM 1a and 
mCEACAM 1b are more divergent in both 
primary structure (sequence identity = 56%) 
and tertiary structure (main chain RMSD = 
1.36 Å) than the non-virus-binding regions are
(Fig. 2C, 2D). These results suggest that 
compared with the rest of the protein, the 
VBMs in the D1 domains of mCEACAM 1a 
have been under strong selective pressure 
possibly from MHV infections.
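
The two divergence measures used in this comparison, sequence identity and main chain RMSD, can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gap columns marked with "-") and that the two coordinate sets have already been superimposed and paired atom by atom. The function names and toy inputs are hypothetical, and this is not the Coot procedure used for Figure 2D.

```python
import math


def percent_identity(aligned_a, aligned_b):
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip any column in which either sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

Restricting both inputs to the residues of one region (for example the VBMs versus the rest of domain D1) gives the kind of region-by-region comparison reported above.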

Further inspection of the structures of 
mCEACAM 1a and mCEACAM 1b has 
identified detailed structural divergence 
between the VBMs of the two proteins. In 
mCEACAM 1a, a critical CC’ loop (loop that
connects β-strands C and C’) in the D1
domain interacts extensively with MHV S1- 
NTD and thus plays a prominent role in the 
virus/receptor binding interactions (Fig. 3A). 
These interactions include the multiple 
hydrophobic interactions between the side 
chain of receptor Ile41 and the side chains of
MHV Tyr15, Leu89, and Leu160 as well as 
the hydrogen bonds between the carbonyl 
oxygen of receptor Thr39 and the side chain of 
MHV Arg20. Compared to mCEACAM 1a, 
mCEACAM 1b has undergone significant 
structural changes in loop CC’ (Fig. 3B, 3C), 
which result from residue changes in this loop. 
For example, residue 38 is a threonine in
mCEACAM 1a but a proline in
mCEACAM 1b. This residue change likely has
a significant impact on the conformation of 
loop CC’ because prolines are known to cause 
changes to protein secondary structures. 
Additionally, a number of residues in other 
VBM regions of mCEACAM 1a also form 
critical interactions with MHV S1-NTD, but
have been substituted with different residues
in mCEACAM 1b. For example, in
mCEACAM 1a, the side chain of receptor 
Arg47 in strand βC’ forms hydrogen bonds
with the carbonyl oxygens of MHV Gln23 and
Val25, the side chains of receptor Met54 and
Phe56 in strand βC” are part of a hydrophobic
cluster at the S1-NTD/receptor interface, and
the side chain of receptor Asn59 in strand βC”
forms hydrophobic stacking with the Cα of
S1-NTD Gly29 (Fig. 4A, 4B). However,
Arg47, Met54, Phe56, and Asn59 in
mCEACAM 1a have been substituted with
His47, Lys54, Thr56, and Pro59, respectively,
in mCEACAM 1b (Fig. 4C, 4D). The above
structural and residue changes in the VBMs 
from mCEACAM 1a to mCEACAM 1b would 
lead to the loss of numerous energetically 
favorable interactions at the S1-NTD/receptor 
interface and disrupt the virus/receptor binding 
interactions. These structural analyses further 
suggest that the VBMs in the D1 domain of 
mCEACAM1 have been under strong 
selective pressure possibly from MHV 
infections. 

To investigate how the structural and 
residue differences between mCEACAM 1a 
and mCEACAM 1b affect their functions as
MHV receptors, we carried out structure-
guided mutagenesis and introduced structural
and residue features from mCEACAM 1b into 
mCEACAM 1a. These structural and residue 
changes include replacing loop CC’ in 
mCEACAM 1a with its counterpart from 
mCEACAM 1b, and substituting residues 47, 
54, 56, and 59 in mCEACAM 1a with the 
corresponding residues from mCEACAM 1b. 
A pseudovirus entry assay was performed 
where a lentiviral vector pseudotyped with the 
MHV spike protein was used to enter
mammalian cells expressing either wild type 
or mutant mCEACAM 1a on their surface. The 
results demonstrated that each of the structural
and residue features from mCEACAM1b
introduced into mCEACAM 1a significantly
reduced the efficiency of pseudovirus entry
(Fig. 5A), consistent with a weaker binding
affinity between the MHV spike protein and the
mutant mCEACAM 1a. Thus, the structural
and residue changes from mCEACAM 1a to
mCEACAM 1b reduced the capability of
mCEACAM 1a to serve as the MHV receptor.
These loss-of-function experiments mimic the
possible loss-of-function evolution of the
mouse Ceacam1a gene under the selective
pressure from MHV infections.
pressure from MHV infections. 

To further explore the functional 
significance of the structural and residue 
differences between mCEACAM 1a and
mCEACAM 1b, we introduced the reverse
substitutions (i.e., the features from
mCEACAM 1a introduced into
mCEACAM 1b). These structural and residue
changes include replacing (i) loop CC’, (ii)
both loop CC’ and strand βC’, or (iii) all of
loop CC’, strand βC’, loop C’C”, and strand
βC” in mCEACAM 1b with the corresponding
regions from mCEACAM 1a. In addition to the 
pseudovirus entry assay, protein-protein 
binding assays were also performed between 
the MHV S1-NTD and wild type or mutant 
mCEACAM 1b. The results showed that all of 
the structural and residue changes introduced 
into mCEACAM 1b significantly enhanced 
both the pseudovirus entry efficiency and 
protein-protein binding affinity (Fig. 5A, 5B).
Among the mutant mCEACAM 1b molecules,
the one containing changes in loop CC’, strand
βC’, loop C’C”, and strand βC” all together
demonstrated the highest MHV receptor 
activity. More specifically, introduction of the 
above structural and residue changes into 
mCEACAM 1b restored the receptor activity 
of mCEACAM 1b up to ~67% of 
mCEACAM 1a based on the pseudovirus entry 
efficiency and ~83% of mCEACAM 1a based 
on the protein-protein binding affinity. It is 
worth noting that incorporation of the above 
structural and residue features from 
mCEACAM 1a did not fully restore the MHV
receptor activity of mCEACAM 1b to the same 
level as mCEACAM 1a, suggesting that 
structural and/or residue differences elsewhere 
in domain D1 may account for the remaining
difference between mCEACAM 1a and
mCEACAM 1b in their MHV receptor 
activities. Nevertheless, these gain-of-function 
experiments represent the reverse course of 
the possible loss-of-function evolution of the 
mouse Ceacam1 gene under the selective
pressure from MHV infections. 


DISCUSSION 

The Red Queen hypothesis states that 
hosts and pathogens are constantly in an 
evolutionary arms race. Previous structural 
studies of the coronavirus/receptor interactions 
have revealed how coronaviruses have 
evolved a variety of strategies to recognize 
different host receptors for host range 
expansion and cross-species infections 
(5,6,15,16,23,25-31). One of these strategies 
would be for coronaviruses to steal a host 
galectin, which became the S1-NTD of the 
coronavirus spike protein, and use it to bind 
sugar on host cell surfaces for viral attachment 
to host cells. While the S1-NTDs of many 
contemporary coronaviruses still recognize 
sugar receptors (32-38), MHV S1-NTD has
evolved novel binding affinity for 
mCEACAM 1a protein, which greatly 
enhanced the infection efficiency of MHV in 
mouse cells (15,16) (Fig. 6). Through the 
above evolution of its spike protein, MHV 
appears to have gained a significant edge in 
the evolutionary arms race with mice and 
become a highly infectious and pathogenic 
virus for mice. 

How have mice evolved to keep pace 
in the evolutionary arms race with MHV? An 
interesting observation is that the mouse 
Ceacam1 gene has diversified into two alleles, 
Ceacam1a and Ceacam1b. Their protein
products, mCEACAM 1a and mCEACAM 1b, 
demonstrate different MHV receptor 
activities: mCEACAM 1b is a much poorer 
MHV receptor than mCEACAM 1a.
Consequently, mice homozygous for 
Ceacam1b are highly resistant to death from 
MHV infections. The selective pressures that 
drive the evolution of mammalian Ceacam1 
genes could come from several sources. 
Mammalian CEACAM1 functions in many 
physiological processes including cell-cell 
adhesion, cell signaling, and cell development
(12). In addition, human CEACAM1 is a
receptor for a variety of bacterial pathogens 
(39-41). Thus, the physiological functions of 
CEACAM1, host evasion of bacterial 
infections, or some other unknown selective 
pressures could potentially drive the evolution 
of mammalian Ceacam1 genes. Although the
physiological functions of mCEACAM 1b
need to be investigated further, mice
homozygous for Ceacam1b apparently retain
all the normal phenotypes of Ceacam1a mice,
suggesting no major alterations of the 
physiological functions of mCEACAM 1b. 
Moreover, mouse CEACAM 1a is not a 
receptor for those bacterial pathogens that use 
human CEACAM1 as the receptor (42). On 
the other hand, MHV infections can be 
devastating to infant mice that express 
mCEACAM 1a. Therefore, while other 
selective pressures cannot be ruled out, MHV 
infection is likely one of the major driving 
forces for the evolution of the mouse Ceacam1
gene. There are insufficient data on mouse
genomes to prove which one of the mouse
Ceacam1 alleles evolved first. Based on the
above discussion, we suggest that in the
mouse population the Ceacam1a allele
preceded the appearance and maintenance of
the Ceacam1b allele in the presence of MHV
epidemics. 

This study investigates the structural 
and residue differences between 
mCEACAM 1a and mCEACAM 1b that render 
mCEACAM 1b a less efficient MHV receptor.
Previous studies identified a critical MHV- 
binding loop CC’ that diverges in sequence 
between mCEACAM 1a and mCEACAM 1b,
partially accounting for the different MHV 
receptor activities of the two proteins (17,24). 
The current study reveals the altered 
conformation of loop CC’ in the crystal 
structure of mCEACAM 1b, providing a 
structural basis for the critical role of loop CC’ 
in the different MHV receptor activities of the 
two mCEACAM1 molecules. Furthermore,
this study identifies residue variations in
several other MHV-binding regions in
mCEACAM 1b that render mCEACAM 1b a
poor MHV receptor. These regions include β-
strands C’ and C” and loop C’C”. Using
structure-guided mutational and functional
assays, this study shows that the structural and
residue substitutions from mCEACAM 1b into
mCEACAM 1a cause a loss of receptor
function in mCEACAM 1a, whereas the
reverse substitutions cause a gain of receptor
function in mCEACAM 1b. The structural and
residue changes from mCEACAM 1a to 
mCEACAM 1b mimic the possible loss-of- 
function evolution in mCEACAM 1a during 
pathogen-driven host evolution, which would 
result in less severe MHV infections in mice 
and partial alleviation of the selective pressure 
from MHV infections (Fig. 6). Therefore, it is
likely that through divergent evolution of their
Ceacam1 gene to generate the Ceacam1b
allele, mice have gained the ability to
keep pace in the evolutionary arms race with 
MHV for fitness and survival. Overall, the 
current study provides insight into a possible 
example of coronavirus-driven evolution of 
a mouse receptor protein, lending molecular
evidence to the Red Queen hypothesis. 


MATERIALS AND METHODS 

Protein preparation and 
crystallization- mCEACAM1b[D1,D4]
(residues 1-202) was expressed and purified as 
previously described for 
mCEACAM1a[D1,D4] (15). Briefly,
mCEACAM1b[D1,D4]
containing a C-terminal His, tag was 
expressed in Sf9 insect cells using the Bac-to-
Bac expression system (Life Technologies),
and was secreted into cell culture medium. 
The protein was harvested and loaded onto a 
nickel-nitrilotriacetic acid (Ni-NTA) column, 
eluted from the Ni-NTA column with 
imidazole, and further purified by gel filtration 
chromatography on Superdex 200 (GE 
Healthcare). The protein was concentrated to 
10 mg/ml and stored in buffer containing 20 
mM Tris pH 7.2 and 200 mM NaCl.
Crystallization of mCEACAM1b[D1, D4] was 
set up using the sitting drop vapor diffusion 
method, with 1 µl protein solution added to 1
µl well buffer containing 0.1 M Tris pH 6.2,
10% PEG4000 (v/v), and 1 M NaCl at 20°C. 
Crystals of mCEACAM1b[D1, D4] appeared 
in 2-3 days and were allowed to grow for 
another 2 weeks before they were harvested 
and flash-frozen in liquid nitrogen. 


Data collection and structure 
determination- X-ray diffraction data were
collected at the Advanced Light Source 
beamline 4.2.2 and processed using HKL2000 
(43). The structure of mCEACAM1b[D1,D4] 
was determined by molecular replacement 
using mCEACAM1a[D1,D4] (PDB 3R4D) as
the search template. The model was built 
using Coot (44) and refined with Refmac (45) 
to a final Rwork and Rfree of 0.216 and 0.273,
respectively. 
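
For reference, the R-factors quoted above follow the conventional crystallographic definition. The expression below is the standard textbook form, given here as an aid to the reader; the symbols F_obs and F_calc (observed and model-calculated structure factor amplitudes) are not defined elsewhere in the text.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

Rwork is this sum taken over the reflections used in refinement, and Rfree is the same sum taken over a small test set of reflections withheld from refinement.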

Pseudovirus entry assay- Lentiviruses 
pseudotyped with MHV spike protein were 
produced as previously described (15). 
Briefly, pcDNA3.1(+) plasmid encoding 
MHV spike protein (from MHV strain A59)
was co-transfected into HEK293T cells with
helper plasmid psPAX2 and reporter plasmid
pLenti-GFP at molar ratio 1:1:1 using
Lipofectamine 2000 (Life Technologies). 48
hours post-transfection, the produced 
pseudovirus particles were harvested and 
inoculated onto the HEK293T cells expressing 
mCEACAM 1a or mCEACAM 1b (wild type or 
mutant). 48 hours post-infection, cells were
observed under a fluorescence microscope, and
the percentage of GFP-expressing cells was
calculated using ImageJ (National Institutes of 
Health). The expression levels of 
mCEACAM 1a and mCEACAM 1b in 
HEK293T cells were measured by Western 
blotting using antibodies against their C- 
terminal C9 tag, quantified using ImageJ, and 
presented as relative expressions in 
comparison to wild-type mCEACAM 1a. The 
relative expression of each receptor was used 
to normalize pseudovirus entry efficiency. The 
experiments were further repeated twice, and 
similar results were obtained. 
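
The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says that relative receptor expression was used to normalize entry efficiency but does not give the arithmetic, so division by relative expression followed by scaling to wild-type mCEACAM 1a (taken as 100%) is assumed here. The function name and the example numbers are hypothetical.

```python
def normalized_entry(gfp_percent, relative_expression,
                     wt_gfp_percent, wt_relative_expression=1.0):
    """Entry efficiency as a percentage of wild type, corrected for expression.

    gfp_percent:          % GFP-positive cells for the receptor being tested
    relative_expression:  its Western blot signal relative to wild type
    wt_gfp_percent:       % GFP-positive cells for the wild-type receptor
    """
    if relative_expression <= 0 or wt_relative_expression <= 0:
        raise ValueError("relative expression must be positive")
    if wt_gfp_percent <= 0:
        raise ValueError("wild-type entry must be positive")
    # Assumed correction: entry per unit of expressed receptor.
    corrected = gfp_percent / relative_expression
    wt_corrected = wt_gfp_percent / wt_relative_expression
    return 100.0 * corrected / wt_corrected
```

For instance, a mutant giving 10% GFP-positive cells at the same expression level as a wild-type receptor giving 40% would be reported as 25% of wild-type entry under this assumption.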

Protein-protein interaction assay 
using AlphaScreen- The interactions between 
recombinant MHV S1-NTD and recombinant 
mCEACAM 1a or mCEACAM 1b (wild type or 
mutant) were measured using AlphaScreen as 
previously described (46,47). Briefly, 300 nM 
MHV S1-NTD with a C-terminal His, tag 
was mixed with 30 nM mCEACAM 1a or
mCEACAM 1b (wild type or mutant) with a C- 
terminal human IgG, Fc tag in OptiPlate-96 
(PerkinElmer) for 1 hour at room temperature. 
AlphaScreen Nickel Chelate Donor Beads and
AlphaScreen protein A acceptor beads
(PerkinElmer) were added to the mixture at 
final concentrations of 20 µg/ml. The mixture
was incubated at room temperature for 1 hour,
protected from light. The assay plates were
read in an EnSpire plate reader (PerkinElmer). 
The experiments were further repeated twice, 
and similar results were obtained.


Table 1. Data collection and refinement statistics

                              mCEACAM1b

Data Collection
  Space group                 P3,21
  Cell dimensions
    a, b, c (Å)               113.097, 113.097, 64.380
    α, β, γ (°)               90, 90, 120
  Resolution (Å)              16.97-3.10 (3.21-3.10)*
  Rsym                        0.114 (0.507)
  I/σI                        9.15 (1.78)
  Completeness (%)            96.3 (86.4)
  Redundancy                  2.5 (2.1)

Refinement
  Resolution (Å)              16.97-3.10
  No. reflections             8557
  Rwork/Rfree                 0.216 / 0.273
  No. atoms
    Protein
    Ligand
    Water
  B-factors (Å²)
    Protein
    Ligand
    Water
  RMSD
    Bond lengths (Å)
    Bond angles (°)
  Ramachandran plot
    Favored (%)
    Outliers (%)

*Values in parentheses are for highest-resolution shell.




FOOTNOTES 


This work was supported by NIH grant R01AI089728. Computer resources were provided by the
Basic Sciences Computing Laboratory of the University of Minnesota Supercomputing Institute. 
Coordinates and structure factors have been submitted to the PDB, accession number 5F1D.


FIGURE LEGENDS 


FIGURE 1. Overall structure of mouse CEACAM1b. (A) Crystal structure of
mCEACAM1b[D1, D4] containing the D1 and D4 domains. Secondary structures in domain D1 
of mCEACAM1b[D1, D4] are named following the mCEACAM1a[D1, D4] structure (23). (B) 
Crystal structure of mCEACAM1a[D1, D4] (PDB 1L6Z) (23). Virus-binding motifs (VBMs) are
in blue. (C) Crystal structure of mCEACAM1a[D1, D4] complexed with MHV S1-NTD (PDB
3R4D) (16). MHV S1-NTD is in cyan, with receptor-binding motifs (RBMs) in red. (D) Sequence alignment of
mCEACAM1b[D1, D4] and mCEACAM1a[D1, D4]. β-strands are shown as arrows. VBMs in
mCEACAM 1a and the corresponding regions in mCEACAM 1b are in blue. Asterisks indicate 
positions that have fully conserved residues; colons indicate positions that have strongly 
conserved residues; periods indicate positions that have weakly conserved residues. The boundary 
between domains D1 and D4 is indicated by a black line. Sequence alignment was done using 
CLUSTAL W (48). Structural illustrations were done using PyMol (49). 


FIGURE 2. Structural comparisons of mCEACAM1a and mCEACAM1b. (A) Overlay of
mCEACAM 1a and mCEACAM 1b in domain D1. mCEACAM 1a is in black, and mCEACAM 1b 
in yellow. VBMs in mCEACAM 1a and the corresponding regions in mCEACAM 1b are in blue. 
(B) Overlay of mCEACAM1a and mCEACAM1b in domain D4. (C) Structure of mCEACAM1b
showing distribution of the residues that differ between mCEACAM 1a and mCEACAM 1b. 
Regions corresponding to VBMs in mCEACAM 1a are in blue. Residues that differ between 
mCEACAM 1a and mCEACAM 1b are shown as balls and sticks. (D) Sequence identities and 
RMSDs between mCEACAM 1a and mCEACAM 1b in different regions. See Figure 1D for the
residue ranges of domains and VBMs. RMSDs were calculated using Coot (50). 


FIGURE 3. Structural comparisons of mCEACAM1a and mCEACAM 1b in MHV-binding 
loop CC’. (A) Interactions between MHV S1-NTD (from MHV strain A59) and loop CC’ of
mCEACAM1a. mCEACAM residues are in green, and S1-NTD residues in magenta.
Hydrophobic interactions are indicated as arrows, and hydrogen bonds as dotted lines. (B)
Structure of loop CC’ of mCEACAM 1b. (C) Overlay of the CC’ loops from mCEACAM 1a and
mCEACAM 1b.


FIGURE 4. Structural comparisons of mCEACAM1a and mCEACAM1b in MHV-binding
strands βC’ and βC”. (A) Interactions between MHV S1-NTD and Arg47 in strand βC’ of
mCEACAM 1a. (B) Interactions between MHV S1-NTD and Met54, Phe56, and Asn59 in strand
βC” of mCEACAM1a. (C) Conformation of the side chain of His47 in strand βC’ of
mCEACAM 1b. (D) Conformations of the side chains of Lys54, Thr56, and Pro59 in strand βC”
of mCEACAM 1b. 


FIGURE 5. Structure-guided mutational and functional characterizations of mCEACAM1a
and mCEACAM1b. (A) Pseudovirus entry efficiency. Lentiviruses pseudotyped with MHV
spike protein (from MHV strain A59) were used to enter HEK293T cells expressing 
mCEACAM 1a or mCEACAM 1b (wild type or mutant). The relative expression of each receptor
was used to normalize pseudovirus entry efficiency. The pseudovirus entry mediated by wild-type
mCEACAM 1a was taken as 100%. Error bars indicate standard errors of the means (n = 3).
Comparisons between wild-type mCEACAM 1a and mutant mCEACAM 1a, or between wild-type
mCEACAM 1b and mutant mCEACAM 1b in their capabilities to mediate pseudovirus entry were
done using two-tailed t-test (**, P<0.01; ***, P<0.001). The mutant mCEACAM 1a molecules
contain single mutations R47H, M54K, F56T, Q59P, or loop CC’ from mCEACAM 1b (residues
38-43). The mutant mCEACAM 1b molecules contain loop CC’ from mCEACAM 1a (residues
38-43), loop CC’ and strand βC’ from mCEACAM 1a (residues 38-51), or loop CC’, strand βC’,
loop C’C”, and strand βC” from mCEACAM 1a (residues 38-59). (B) Protein-protein binding
assay. The interactions between MHV S1-NTD and mCEACAM 1a or mCEACAM 1b (wild type
or mutant) were measured using AlphaScreen assay. MHV S1-NTD with a C-terminal His, tag 
and mCEACAM 1a or mCEACAM 1b (wild type or mutant) with a C-terminal human IgG, Fc
tag were attached to AlphaScreen Nickel Chelate Donor Beads and AlphaScreen protein A
acceptor beads, respectively. Error bars indicate standard errors of the means (n = 3). 
Comparisons between wild-type mCEACAM 1b and mutant mCEACAM 1b in their binding 
affinity for MHV S1-NTD were done using two-tailed t-test (***, P<0.001).


FIGURE 6. Proposed evolutionary arms race between MHV and mice. The race includes
both MHV-driven evolution of mice (top) and mouse-adapting evolution of MHV (bottom). See
text for detailed discussion.